Gibbs Sampling for (Coupled) Infinite Mixture Models in the Stick Breaking Representation
Abstract
Nonparametric Bayesian approaches to clustering, information retrieval, language modeling and object recognition have recently shown great promise as a new paradigm for unsupervised data analysis. Most contributions have focused on Dirichlet process mixture models or extensions thereof, for which efficient Gibbs samplers exist. In this paper we explore Gibbs samplers for infinite complexity mixture models in the stick breaking representation. The advantage of this representation is improved modeling flexibility. For instance, one can design the prior distribution over cluster sizes or couple multiple infinite mixture models (e.g., over time) at the level of their parameters (i.e., the dependent Dirichlet process model). However, Gibbs samplers for infinite mixture models (as recently introduced in the statistics literature) seem to mix poorly over cluster labels. Among other issues, this can have the adverse effect that labels for the same cluster in coupled mixture models are mixed up. We introduce additional moves in these samplers to improve mixing over cluster labels and to bring clusters into correspondence. An application to modeling of storm trajectories is used to illustrate these ideas.
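As a rough illustration of the stick breaking representation mentioned in the abstract, the sketch below draws truncated mixture weights by repeatedly breaking off a Beta-distributed fraction of the remaining stick. It is not the paper's sampler: the function name `stick_breaking_weights`, the truncation level `num_sticks`, and the choice of Beta(1, alpha) breaks (the Dirichlet process special case) are assumptions made for this example only.

```python
import random


def stick_breaking_weights(alpha, num_sticks, seed=0):
    """Draw truncated stick-breaking weights (hypothetical helper).

    Each break fraction v_k ~ Beta(1, alpha), which is the Dirichlet
    process special case; other Beta parameters would give a different
    prior over cluster sizes.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of stick not yet assigned to any cluster
    for _ in range(num_sticks):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(v * remaining)    # weight of cluster k
        remaining *= 1.0 - v             # what is left for later clusters
    return weights, remaining


weights, remaining = stick_breaking_weights(alpha=2.0, num_sticks=50)
print(len(weights), sum(weights) + remaining)
```

The weights and the leftover stick always sum to one, so the leftover mass indicates how much of the infinite mixture the truncation ignores. Swapping the Beta parameters per stick is one way the prior over cluster sizes could be shaped, under the assumptions stated above.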
Similar resources
PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment.
Modeling across site variation of the substitution process is increasingly recognized as important for obtaining more accurate phylogenetic reconstructions. Both finite and infinite mixture models have been proposed and have been shown to significantly improve on classical single-matrix models. Compared with their finite counterparts, infinite mixtures have a greater expressivity. However, they...
Some Further Developments for Stick-breaking Priors: Finite and Infinite Clustering and Classification
SUMMARY. The class of stick-breaking priors and their extensions are considered in classification and clustering problems in which the complexity, the number of possible models or clusters, can be either bounded or unbounded. A conjugacy property for the extended stick-breaking prior is established which allows for informative characterizations of the priors under i.i.d. sampling, and which fu...
Bayesian learning of joint distributions of objects
There is increasing interest in broad application areas in defining flexible joint models for data having a variety of measurement scales, while also allowing data of complex types, such as functions, images and documents. We consider a general framework for nonparametric Bayes joint modeling through mixture models that incorporate dependence across data types through a joint mixing measure. Th...
Markov chain Monte Carlo methods for Dirichlet process hierarchical model
Inference for Dirichlet process hierarchical models is typically performed using Markov chain Monte Carlo methods, which can be roughly categorised into marginal and conditional methods. The former integrate out analytically the infinite-dimensional component of the hierarchical model and sample from the marginal distribution of the remaining variables using the Gibbs sampler. Conditional metho...
Gibbs Sampling Methods for Stick-Breaking Priors
A rich and flexible class of random probability measures, which we call stick-breaking priors, can be constructed using a sequence of independent beta random variables. Examples of random measures that have this characterization include the Dirichlet process, its two-parameter extension, the two-parameter Poisson–Dirichlet process, finite dimensional Dirichlet priors, and beta two-parameter pro...
Journal title: CoRR
Volume: abs/1206.6845
Publication date: 2006